Cover Tree Bayesian Reinforcement Learning
This paper proposes an online tree-based Bayesian approach for reinforcement
learning. For inference, we employ a generalised context tree model. This
defines a distribution on multivariate Gaussian piecewise-linear models, which
can be updated in closed form. The tree structure itself is constructed using
the cover tree method, which remains efficient in high-dimensional spaces. We
combine the model with Thompson sampling and approximate dynamic programming to
obtain effective exploration policies in unknown environments. The flexibility
and computational simplicity of the model render it suitable for many
reinforcement learning problems in continuous state spaces. We demonstrate this
in an experimental comparison with least squares policy iteration.
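The Thompson sampling step described above can be illustrated with a minimal sketch. The Bernoulli-bandit setting and the `ThompsonSampler` class below are illustrative assumptions, not the paper's actual cover-tree model; what carries over is the pattern of sampling one plausible model from a conjugate posterior, acting greedily with respect to it, and updating the posterior in closed form.

```python
import random

class ThompsonSampler:
    """Thompson sampling over Bernoulli arms: a toy stand-in for
    sampling an MDP model from the posterior, as the paper does."""

    def __init__(self, n_arms):
        # Beta(1, 1) priors: one conjugate posterior per arm.
        self.successes = [1] * n_arms
        self.failures = [1] * n_arms

    def select_arm(self):
        # Draw one plausible mean reward per arm from its posterior,
        # then act greedily with respect to the sampled model.
        samples = [random.betavariate(s, f)
                   for s, f in zip(self.successes, self.failures)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        # Closed-form conjugate update, analogous to the paper's
        # closed-form context-tree posterior update.
        if reward:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1
```

Because exploitation happens through posterior sampling rather than an explicit exploration bonus, the sampler naturally concentrates its pulls on the arm with the highest posterior mean as evidence accumulates.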
Adaptive Submodular Influence Maximization with Myopic Feedback
This paper examines the problem of adaptive influence maximization in social
networks. Since adaptive decision making is a time-critical task, we consider a
realistic feedback model, called myopic feedback. In this direction, we propose
a myopic adaptive greedy policy that is guaranteed to provide a (1 -
1/e)-approximation of the optimal policy under a variant of the independent
cascade diffusion model. This strategy maximizes an alternative utility
function that we prove to be adaptive monotone and adaptive submodular. The
proposed utility function considers the cumulative number of active nodes over
time, instead of the total number of active nodes at the end of the diffusion.
Our empirical analysis on real-world social networks reveals the benefits of
the proposed myopic strategy, validating our theoretical results.
Comment: Accepted by the IEEE/ACM International Conference on Advances in
Social Networks Analysis and Mining (ASONAM), 201
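The adaptive greedy template underlying such policies can be sketched as follows. The toy graph, the edge probability `p`, and the Monte Carlo gain estimator are illustrative assumptions; this is a sketch of the generic adaptive greedy loop with myopic observations, not the exact policy or utility function analyzed in the paper.

```python
import random

def simulate_ic(graph, seeds, p, rng):
    """One independent-cascade run; returns the set of active nodes."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        node = frontier.pop()
        for nbr in graph.get(node, []):
            if nbr not in active and rng.random() < p:
                active.add(nbr)
                frontier.append(nbr)
    return active

def adaptive_greedy(graph, budget, p=0.3, n_sims=200, seed=0):
    """Adaptive greedy: after each pick, myopically observe which
    immediate neighbours activated, then re-estimate marginal gains."""
    rng = random.Random(seed)
    chosen, observed_active = [], set()
    for _ in range(budget):
        def gain(v):
            # Monte Carlo estimate of the expected spread when v is
            # added to everything observed active so far.
            total = 0
            for _ in range(n_sims):
                total += len(simulate_ic(graph, observed_active | {v}, p, rng))
            return total / n_sims
        best = max((v for v in graph if v not in chosen), key=gain)
        chosen.append(best)
        # Myopic feedback: only the seed's direct neighbours' realised
        # activations are revealed, not the full cascade.
        observed_active.add(best)
        for nbr in graph.get(best, []):
            if rng.random() < p:
                observed_active.add(nbr)
    return chosen
```

The key difference from non-adaptive greedy is that marginal gains are re-estimated conditioned on the feedback observed after each selection, which is what the adaptive submodularity analysis licenses.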
A Bayesian Ensemble Regression Framework on the Angry Birds Game
An ensemble inference mechanism is proposed for the Angry Birds domain. It is
based on an efficient tree structure for encoding and representing game
screenshots, exploiting its enhanced modeling capability. This has the
advantage of establishing an informative feature space and recasting the task
of game playing as a regression analysis problem. In this direction, we assume
that each pair of object material and bird type has its own Bayesian linear
regression model. In this way, a multi-model regression framework is designed
that simultaneously calculates the conditional expectations of several objects
and makes a target decision through an ensemble of regression models. Learning
is performed according to an online estimation strategy for the model
parameters. We provide comparative experimental results on several game levels
that empirically illustrate the efficiency of the proposed methodology.
Comment: Angry Birds AI Symposium, ECAI 201
Learning Graph Representations for Influence Maximization
As the field of machine learning for combinatorial optimization advances,
traditional problems resurface and are readdressed through this new
perspective. The overwhelming majority of the literature focuses on small graph
problems, while many real-world problems involve large graphs. Here, we focus
on two such problems: influence estimation, a #P-hard counting problem, and
influence maximization, an NP-hard problem. We develop GLIE, a Graph Neural
Network (GNN) that inherently parameterizes an upper bound on influence
estimation, and train it on small simulated graphs. Experiments show that GLIE
provides accurate influence estimates for real graphs up to 10 times larger
than those in the training set. More importantly, it can be used for influence
maximization on considerably larger graphs, as the ranking of its predictions
is not affected by the drop in accuracy. We develop a version of CELF
optimization that uses GLIE instead of simulated influence estimation,
surpassing the benchmark for influence maximization, albeit with a
computational overhead. To balance time complexity and influence quality, we
propose two different approaches. The first is a Q-network that learns to
choose seeds sequentially using GLIE's predictions. The second defines a
provably submodular function based on GLIE's representations to rank nodes
quickly while building the seed set. The latter provides the best combination
of time efficiency and influence spread, outperforming SOTA benchmarks.
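The CELF variant mentioned above can be sketched generically. Here `spread` is an abstract influence oracle standing in for GLIE's prediction (in classical CELF it would be Monte Carlo simulation); the function names and the staleness-stamp bookkeeping are illustrative, but the lazy-evaluation logic, which is valid for any submodular spread function, is standard CELF.

```python
import heapq

def celf(candidates, spread, budget):
    """CELF lazy greedy: submodularity guarantees that a node's
    marginal gain only shrinks as the seed set grows, so most gains
    never need recomputing.  `spread(S)` is the influence oracle."""
    seeds, current = [], spread(set())
    # Max-heap (via negated gains) of (-gain, node, iteration_stamp).
    heap = [(-(spread({v}) - current), v, 0) for v in candidates]
    heapq.heapify(heap)
    for it in range(1, budget + 1):
        while True:
            neg_gain, v, stamp = heapq.heappop(heap)
            if stamp == it:
                # Gain was recomputed this round: it is the true max.
                seeds.append(v)
                current += -neg_gain
                break
            # Stale gain: recompute w.r.t. the current seed set.
            gain = spread(set(seeds) | {v}) - current
            heapq.heappush(heap, (-gain, v, it))
    return seeds
```

Swapping a learned predictor in for `spread` preserves the seed ranking as long as the predictor's ordering of marginal gains matches the simulated one, which is exactly the property the abstract reports for GLIE.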
Boosting Tricks for Word Mover's Distance
Word embeddings have opened a new path for creating novel approaches to
traditional problems in the natural language processing (NLP) domain. However,
using word embeddings to compare text documents remains a relatively unexplored
topic, with Word Mover's Distance (WMD) being the prominent tool used so far.
In this paper, we present a variety of tools that can further improve the
computation of distances between documents based on WMD. We demonstrate that
alternative stopwords, cross document-topic comparison, deep contextualized
word vectors, and convex metric learning constitute powerful tools that can
boost WMD.
Comment: ICANN 2020 (the physical meeting was postponed due to the COVID-19
pandemic and rescheduled for September 2021 in Bratislava, Slovakia)
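For context, WMD is an optimal-transport distance between the word distributions of two documents, with word-embedding distances as transport costs. The full WMD requires solving a linear program; the sketch below instead computes the Relaxed WMD lower bound of Kusner et al., where each word's mass flows entirely to its nearest word in the other document. The function name and the Euclidean cost choice are assumptions for illustration, not the paper's own code.

```python
import numpy as np

def rwmd(weights_a, weights_b, emb_a, emb_b):
    """Relaxed Word Mover's Distance: a cheap lower bound on WMD.
    weights_* are normalised bag-of-words weights; emb_* are the
    corresponding word-embedding matrices (one row per word)."""
    # Pairwise Euclidean costs between the two embedding sets.
    cost = np.linalg.norm(emb_a[:, None, :] - emb_b[None, :, :], axis=2)
    # Each word sends all its mass to the nearest opposite word;
    # taking the max of the two directions tightens the bound.
    one_way = weights_a @ cost.min(axis=1)    # a -> b
    other_way = weights_b @ cost.min(axis=0)  # b -> a
    return max(one_way, other_way)
```

The boosting tricks surveyed in the paper (stopword choices, contextualized vectors, learned metrics) plug in at the level of `weights_*` and the cost matrix, leaving the transport computation itself unchanged.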
ABC Reinforcement Learning
We introduce a simple, general framework for likelihood-free Bayesian reinforcement learning via Approximate Bayesian Computation (ABC). The advantage is that we only require a prior distribution over a class of simulators. This is useful when a probabilistic model of the underlying process is too complex to formulate, but detailed simulation models are available. ABC-RL allows the use of any Bayesian reinforcement learning technique in this case. It can be seen as an extension of simulation methods to both planning and inference. We experimentally demonstrate the potential of this approach in a comparison with LSPI. Finally, we prove a theorem showing that ABC is sound.
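The ABC inference step at the core of such a framework can be sketched as plain rejection sampling. The function signature, the mean summary statistic, and the coin-flip example in the test are illustrative assumptions; the paper's setting involves full simulators of environment dynamics, but the accept-if-close mechanism is the same.

```python
import random

def abc_posterior(observed, simulate, prior_sample, eps, n_draws, seed=0):
    """ABC rejection sampling: keep simulator parameters whose synthetic
    data falls within eps of the observed summary statistic.  The
    accepted draws form an approximate, likelihood-free posterior;
    ABC-RL then plans against simulators drawn from this set."""
    rng = random.Random(seed)
    observed_stat = sum(observed) / len(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)                  # draw from the prior
        synth = simulate(theta, len(observed), rng)  # run the simulator
        if abs(sum(synth) / len(synth) - observed_stat) < eps:
            accepted.append(theta)
    return accepted
```

No likelihood is ever evaluated: only the ability to sample from the prior and to run the simulator forward is required, which is precisely what makes the approach applicable when the underlying process is too complex to model probabilistically.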